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Abstract 

We study multipath parameter estimation from orthogonal frequency division multiplex signals 
transmitted over doubly dispersive mobile radio channels. We are interested in cases where the trans- 
mission is long enough to suffer time selectivity, but short enough such that the time variation can be 
accurately modeled as depending only on per- tap linear phase variations due to Doppler effects. We 
therefore concentrate on the estimation of the complex gain, delay and Doppler offset of each tap of 
the multipath channel impulse response. We show that the frequency domain channel coefficients for 
an entire packet can be expressed as the superimposition of two-dimensional complex sinusoids. The 
maximum likelihood estimate requires solution of a multidimensional non-linear least squares problem, 
which is computationally infeasible in practice. We therefore propose a low complexity suboptimal 
solution based on iterative successive and parallel cancellation. First, initial delay/Doppler estimates 
are obtained via successive cancellation. These estimates are then refined using an iterative parallel 
cancellation procedure. We demonstrate via Monte Carlo simulations that the root mean squared error 
statistics of our estimator are very close to the Cramer-Rao lower bound of a single two-dimensional 
sinusoid in Gaussian noise. 

I. Introduction 

In wireless communications, reflection and diffraction of the transmitted radio signal results in 
the superimposition of multiple complex-scaled and delayed copies of the signal at the receiver. 
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This type of channel is commonly referred to as a multipath channel. In some instances, the 
multiple copies add constructively, and in others destructively resulting in multipath fading. 
When the coherence bandwidth of the channel is smaller than the bandwidth of the radio signal 
then the fading is termed frequency selective [1]. We assume the reader is familiar with standard 
wide-sense stationary uncorrelated scattering models, for an overview see e.g. [2]. 

Orthogonal frequency division multiplexing (OFDM) is a transmission strategy specifically 
designed to combat frequency selective channels with relatively low receiver complexity [3- 
5]. In OFDM, the signal bandwidth is divided into several non-overlapping (hence orthogonal) 
narrowband subcarriers where the width of each subchannel is chosen such that it is approxi- 
mately frequency non-selective. Thus only a single tap equaliser per subchannel is required to 
compensate for the multipath fading. Together with the use of the fast Fourier transform (FFT), 
this results in a low complexity way to handle frequency- selective channels. As such, OFDM is 
now the basis of many current and emerging wireless communications standards, see [6, 7] for an 
overview. Many of these standards are targeted for outdoor mobile applications, e.g. 802. lip [8]. 
Mobility causes the multipath channel (and hence frequency selectivity) to change with time. If 
the mobility is fast enough compared to the symbol rate, then the channel impulse response may 
vary significantly within an OFDM packet. Extensive field trials have shown that this is indeed the 
case for the transmission of 802.11 OFDM signals in vehicular environments [9]. Time- varying 
multipath channels such as these are commonly termed doubly-dispersive [10-12]. 

In general, a realization of a doubly selective multipath channel time-varying impulse response 
can be modeled in continuous time as 



where c(t, r) is the response at delay r to an impulse at time t, where S(t) denotes the Dirac-delta 
function. The a p (t) are the time- varying complex amplitude (magnitude and phase) of tap p, with 
delay t p . The number of resolvable multipath components is P. The a p (t) may aggregate many 
more unresolvable multipath components, typically resulting in Ricean or Rayleigh statistics for 
these parameters. 

Note that for sufficiently short time durations, mobility-induced Doppler shifts manifest as 
linear variations of the phase of a p (t) with time. In this paper, we consider the special case 
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P =i 
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where the OFDM packet duration is short enough such that we can model the channel as 



where a p , r v and v v respectively denote the complex gain, delay and Doppler frequency (relative 
to the nominal carrier frequency) of tap p. These parameters are all assumed to be constant 
over the duration of an OFDM packet. In a physical sense, this implies that changes in the 
relative distance and velocity between the transmitter, receiver and scatterers are negligible over 
the duration of an OFDM packet. This model is consistent with the geometric- stochastic model 
presented in [13] for short observation windows, and has been validated experimentally in [9]. 

In this paper, we concentrate on joint estimation of a p , r v and v p of the multipath components 
assuming perfect knowledge of the transmitted OFDM symbols. This is a practical assumption, 
e.g. a transmitted training/pilot signal, or the receiver is able to decode the signal without error 
(via a forward error correction code). Estimation of these parameters is useful in a number 
of areas: channel sounding and characterisation; channel prediction; reducing channel state 
information for feedback in adaptive communications; and radar. Estimation of these parameters 
in the OFDM setting has been studied previously by a number of researchers from both the 
communications and radar fields. Channel estimation via an approximate maximum likelihood 
parameter search algorithm was proposed by Thomas et al. [14]. Their iterative algorithm was 
based on an approximation of the maximum likelihood function, where the multipath gain 
values are substituted with their least-squares estimates. In the radar community, estimation 
of delay /Doppler is vital for determination of target range and velocity. Berger et al. [15] 
studied the problem of extracting the target range/velocity information from the OFDM signal 
in a passive multi-static radar system [16] using digital audio/video broadcasted signals as 
illuminators of opportunity. They set up the problem as a sparse estimation problem to use 
recent results from compressed sensing [17]. In particular, they employ the orthogonal matching 
pursuit algorithm [18, 19], which is in an iterative algorithm that successively removes previously 
estimated multipath components from the received signal to estimate new components. Note that 
Taubock et al. [20,21] also consider compressed sensing to estimate the OFDM channel coeffi- 
cients. However, their interest is not in the estimation of delay s/Doppler, but in the frequency /time 
channel coefficients. 

In this paper we begin with a continuous-time model of the transmitted OFDM signal and 
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derive the received matched filtered signal from first principles. Assuming the delay-spread of the 
channel does not exceed the cyclic-prefix and the pass-band of the receive/transmit filters exceed 
the signal bandwidth (with negligible pass-band ripple), we show that the resulting frequency 
domain channel coefficients can be represented as the superimposition of two-dimensional (2- 
D) complex sinusoids, where each 2-D frequency is proportional to the delay and Doppler of 
each multipath component. Similar observations have been made by Wong and Evans [22, 23] 
although without detailed justification. Under a similar setting they consider estimation using 
only OFDM pilot symbols and propose channel prediction algorithms based on the estimation 
of channel parameters via a rotational invariance technique. Using this method, they reformulate 
the problem as a one-dimensional estimation problem. 

Parameter estimation of 2-D sinusoids in a general setting has been studied extensively many 
years prior to the work of Wong and Evans [22,23]. Estimation methods and the Cramer- Rao- 
Lower-Bound (CRLB) for the single 2-D sinusoid case was investigated by Chien [24]. Kay 
and Nekovei [25] proposed a low complexity estimator that operates on the phase of the noisy 
2-D sample data. For the estimation of the superposition of multiple 2-D sinusoids: Bresler and 
Macovski employ a 2-D version of Prony's method [26]; Rao et al. [27] use a similar polynomial 
rooting approach; and recently Kliger and Francos [28] consider maximum-likelihood estimation 
with a maximum a-posteriori (MAP) model order selection rule for the case where the number 
of sinusoids is unknown. 

In this paper we concentrate purely on the estimation of the complex amplitude, delay and 
Doppler of each multipath tap, assuming the number of taps is known. Our results can be 
straightforwardly extended to the case where the number of taps is unknown using well-known 
model-order selection methods [29]. The maximum-likelihood approach requires the solution to 
a multi-dimensional nonlinear least-squares estimation problem [30] and hence has complexity 
that is prohibitive in practice [26-28]. We propose a low-complexity algorithm based on a two- 
stage process: first, an initial estimation; followed by a refinement proceedure. In the same 
spirit as [14, 15, 28] the initial estimation algorithm is based on successive cancellation, whereby 
multipath components are subtracted from the original signal after they are detected. In each 
iteration, the delay /Doppler is estimated using periodogram search [24, 25] via a 2-D bisection 
algorithm. The multipath complex amplitudes are then obtained via standard linear least-square 
estimation [31]. Moreover, we show that this secondary problem can be written in terms of 
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the ambiguity function [32]. Once initial estimates have been obtained, we then propose an 
iterative refinement algorithm based on parallel cancellation. Each iteration of the refinement 
involves subtracting all multipath components from the received signal except the component 
of interest, which is re-estimated using the 2-D bisection algorithm. This refinement process 
yields significant improvements over the standard successive cancellation approach. We show 
via Monte-Carlo simulations that this refinement algorithm achieves performance very close to 
the CRLB for single 2-D sinusoid estimation. 

The remainder of our paper is organised as follows. In Section II we state the system model 
and derive from first principles the received match filtered frequency domain OFDM symbols. In 
Section III we derive the transmit signal ambiguity function. Then in Section IV we present our 
proposed estimation algorithm and enhanced refinement process. Simulation results are presented 
in Section V. Finally, concluding remarks are given in Section VI. 

II. System Model 

Consider a K subcarrier OFDM system, where packets of length L OFDM symbols are 
transmitted. Let X E C LxK denote a packet of complex OFDM symbols. Thus X^k, the I, kth 
element of X, denotes the Zth symbol transmitted on subcarrier k, for I — 1, . . . , L and k = 
1, . . . , K. In practical OFDM systems, a certain number null subcarriers are employed to simplify 
receiver design [5]. To incorporate this feature, we let K denote the set of null subcarrier indices. 
Thus, Xi jk = for all k E K. and I = 1,...,L. For all other subcarriers, i.e. k /C, we 
assume X k ^ E X, where X C C is an arbitrary complex constellation. These symbols are drawn 
randomly, independently and uniformly from X, which is normalised to have unit average energy. 
Thus E[|X ijfc | 2 ] = 1 for k £ JC and E[X lik X* tTn ] = for any n ^ / or m ^ k. The receiver 
is assumed to have complete knowledge of the transmitted symbols X^ k , e.g. a pilot/training 
signal, or from the feedback of error-free decoder decisions. 

Let x(t) = ^2f =1 xi(t) denote the complex baseband continuous-time transmitted OFDM 
signal, where 

v KL k=i 

is the Zth OFDM symbol, Td is the OFDM symbol duration (seconds), l/T is the subcarrier 
spacing (Hz), T cp = T d — T is the cyclic prefix duration (seconds), and w(t) is a windowing 
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function such that 

\w(t) 0<t<T d 
w(t) = I (3) 

I otherwise, 

and J Td \w(t)\ 2 dt — 1. A simple choice of windowing function is w(t) = l/^/T~ d . Note that 
the assumption w(t) = for t £ (0,T d ) is not necessarily required, but we assume this for 
simplicity. In practice, (2) is implemented in the discrete time domain via the inverse discrete 
Fourier transform (IDFT) [5]. 

We assume transmit and receive filter impulse responses g T (^) and g^(t) respectively, and 
let g(t) = /^<7t(u)<7r(* — u)du denote the combined transmit/receive filter response.Thus, 
using (1), we write the overall channel response as 

/oo P 
g(u)c(t, r-u)du = J2 ape-^gir - r p ). (4) 
-°° P =i 

Application of the overall channel response (4) to (2) plus additive Gaussian white noise (AWGN) 
yields the received continuous-time baseband signal, 

/oo 
Xi>(t-T)h(t,T)dT + z(t), (5) 

where z(t) = z(t — u)g^{u) du, and z(t) is an additive white Gaussian noise (AWGN) 
process. Assuming perfect OFDM symbol synchronism, the receiver discards the cyclic prefix 
and performs the matched filter to the transmitted sinusoids, i.e. 

Y t , k = -L= r y(t)w*(t -(I- i) T(| ) e -iar(*-i-W2j)(*-r w )/T (ft> (6) 

VKL JT zp +(i-i)T d 

fovk = \,...,K and I — 1, . . . , L. Note that in practice, Yi yk is obtained via the discrete Fourier 
transform (DFT) [5]. We assume the pass-band of filters g T and g R exceed the signal bandwidth. 
In addition, we assume max p r p < T cp and max p \u p \ < l/T, so that inter-symbol interference 
(ISI) and inter-carrier interference (ICI) can be considered negligible. Under these assumptions, 
in Appendix I we show that the matched filtered output (6) can be written in matrix form 

Y = H X + Z, (7) 

where denotes the element- wise (Hadamard) product, Y € C LxK is the received matrix of 
filtered noisy OFDM symbols, Z e C LxK is a matrix of independent identically distributed 
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(i.i.d.) zero mean complex Gaussian random variables with variance a 2 , and H e C LxK are the 
frequency domain channel coefficients, 

H t , k = ape-^W-Ve-^- 1 '^/ 2 ^. (8) 
P =i 

In relation to (7), we define the signal-to-noise ratio (SNR) as snr = E [\\X \\ 2 ] /(La 2 ) = 
(K — |/C|)/o- 2 , where || ■ || denotes the Frobenius norm [33]. 

From inspection of (8), we see that it is simply the superimposition of 2-dimensional (2-D) 
complex exponential signals. We may also express H as the matrix product 

H = *(jy)diag(a)$ t (r), (9) 

where diag(a) denotes a P x P diagonal matrix with diagonal entries a — (ai, . . . , ap), and 

*„(!/) = M"p) = e-^- 1 ^ (10) 

^kAr) = Mr P ) = ^ rik - 1 - lKm}T ' /T , (ID 

for p = 1, . . . , P, I = 1, . . . , L and A; = 1, . . . , K. As we shall see later, the separation of the 
parameters in this matrix form will simplify the development of our estimation algorithms. 
In the analysis that is to follow, we will make use of the vectorised version of (7). Let 

y = vcc(Y) = (Y hl , . . . , Y Ltl , Y lj2 , Y Lj2 , Y l>K , . . . , Y LjK )', and z = vec(Z) then 

y = n(T,v,X)a + z. (12) 
where the KL x P matrix f2 is a function of r, u, and X as follows, 



x Ltl Mvi)<Pl(n) x L ,i^i(^)^(r 2 ) 

X 1>2 l/>1 (^l)02( T l) ^1,2^1 (^2)02( r 2) 



X L< li) L {v P )(t)* 1 {Tp) 
X h2 4>l {Up) 02 ( T p) 



^,2^(^1)02^1) ^L,2^l(^2)02( r 2) ••• X Lj2 lp L (lS P )(f)*(Tp) 



^i,^i(^i)0^(n) ^1,^1(^2)0^(7-2) ••• X^Ktp^lSp^iTp) 



\X Lt K^L{viW K { T l) X L , K lP L (v 2 )(f>* K (T 2 ) 

where (•)* denotes the complex conjugate. 



, (13) 



Xl^lM&kM J 
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III. Ambiguity Function 

The ambiguity function of the transmitted signal x(t) is defined as the inner product of the 
signal with a delayed, frequency shifted version of itself [32] 



/oo 
x{t)x*{t-T)e' j2nut dt. 
■oo 



(14) 



In the context of OFDM communications the ambiguity function has been used often as a tool 
for pulse design and optimisation [10, 11]. In radar systems, the ambiguity function plays an 
important role in determining target range and velocity resolution [32]. In this section we derive 
the ambiguity function of the transmitted OFDM signal and highlight important characteristics 
that will affect the delay/Doppler estimation problem. Moreover, as we shall see later, parts of 
the estimation problem can be written succinctly in terms of the ambiguity function. In this 
direction, substitution of (2) into (14) yields the following result. 

Theorem 1 (OFDM Ambiguity Function). For a general windowing function w(t), let A w (v, r) 
denote its ambiguity function. The ambiguity function of the OFDM signal (2) is, 

x Aw (t + (1' -l)T d ,v+(k' -k)/T). (15) 

Due to the quadruple summation, numerical evaluation of (15) is computationally demanding. 
However, under some common practical design assumptions we can make further simplifications. 
Firstly, the windowing function is usually designed such that A w (t, v) for |t| > T d . Window 
functions of the form (3) will have this property. Thus the terms when V ^ I in the summation 
of (15) are approximately zero. Secondly, although A w (t,u) cannot be considered negligible 
for \v\ > (k' — k)/T, since X^ k are i.i.d. with zero mean and unit variance by assumption 
(except for the null subcarriers, which have zero power), the summation of terms over k ^ k! 
will approach zero for large K and/or L. In addition, since we are primarily concerned with 
delay and Doppler in the region < r < T cp and \v\ <ti 1/T, where there is negligible variation 
in A w {t,v), we may assume A w (t,u) p^ A w (0,0), which only introduces a constant phase 
offset, since |A W (0,0)| 2 = I. Hence we ignore the complex scaling affects of A w (t,u) and 
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approximate (15) as 

A x (r, v) « A x (r, ^) = ^ E I^IV^ 1 - 1 )^^*- 1 -^)^. (16) 

k,l 

In light of (15), note that (16) is also the ambiguity function when ISI and ICI can be considered 
negligible. 

For phase shift keying (PSK) modulation, \Xi :k \ = 1 with no null subcarriers, i.e. /C = 0, then 
using the geometric summation formula, the ambiguity function (16) further simplifies to 

A x (t, v) « e^^ r/T sinc (jrK^j sine (nLuT d ) , (17) 

where sinc(x) = sin(x)/a;. Note that (17) is also equal to the expectation of (15) over the 
transmitted symbols X\^, i.e. it is the expected ambiguity function, E [A^r, v)\, for an arbitrary 
signal constellation X with zero mean and unit variance. In many OFDM standards, subcarrier 
k = [K/2\ + 1 is a null subcarrier. For these special cases (17) becomes, 

A x (t, v) « e - jnKT/T coa [n(K/2 + l)(r/T)] sinc(7rriir/(2r))sinc(7ri^r d L) (18) 

From the above expressions (17) and (18), we see that the sine terms introduce sidelobes in 
the ambiguity function. Interestingly, the sidelobes are de-coupled in time and frequency. As an 
example, the ambiguity function of the 802.11a standard [34, 35] is plotted in Fig. 1, i.e. K = 53 
subcarriers, with a null subcarrier at k = [K/2\ + 1, Td = 8 fisec and T = 6.4 /isec. From 
Fig. 1, as predicted by (18) we see that increasing L improves the Doppler resolution, but not 
the delay resolution, which is dependent only on K and the subcarrier spacing (1/T). 

IV. Multipath Parameter Estimation 

Our primary objective is to estimate a = (a 1: . . . , a P ), r = (ti, . . . , t p ) and v = (z/ 1; . . . , v P ) 
in (1) from the received noisy symbols Y (6) given perfect knowledge of X. In this section, 
without loss of generality, for brevity of notation, we assume the OFDM system has no null 
subcarriers, i.e. /C = 0. Using (12), the maximum likelihood (ML) approach is to solve the 
following 

(a, f, if) = arg min \\y — f2(r, if, X)a|| 2 , (19) 

a,T,u 

which is a non-linear least squares minimisation problem. The computational complexity can be 
reduced by replacing a with its least squares estimate. That is, for a given r and v the ML 
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estimate of a is a linear least squares minimisation problem, which has solution [31] 

a = (ntn) _1 nt 3/> (2 o) 

where we have dropped the dependence of X, r and u for brevity of notation. Hence substitut- 
ing (20) for a in (19) results in the reduced problem 

(f,t>) = argmax y t n(n t n) _1 fi t y. (21) 

It is known that problems (21) and (19) are equivalent, i.e. (21) followed by (20) is also the ML 
solution [26,36]. Unfortunately, (21) is in general multimodal, rendering the multidimensional 
search for a global extremum computationally prohibitive. 

Before we begin our reduced complexity suboptimal solution, let us first make some interesting 
observations about (21). Let R = fl^fl and w = f^t/. From (13), it is straightforward to show, 

Rij = KLA x {ji - Tj, Vj - Vi) (22) 

^^^(yer)^,), (23) 

for i,j = \,...,P, where ififa) and 0(rj) denote column % of the matrices \I> and $ respectively, 
and A* denotes the element-wise conjugate of the matrix A. Thus, rather than performing com- 
putation of R = fi^fi (requiring on the order of PKL(KL+l)/2 complex multiply-accumulate 
operations) using standard matrix operations, to reduce complexity, R can be evaluated using the 
ambiguity function via a look-up table. Moreover, for the special case of PSK modulation, (17) 
implies we only need the evaluation of a sinc(x) function. 
For the special case of P = 1, the ML solution (21) becomes 

(r 1 ,h)= argmax (Y © X*) <^(n) f , (24) 

after which the corresponding complex gain ML estimates can be determined using (20), 

1 -^(^(yor)^). (25) 



KL 

We see that the solution to (21) corresponds to the maximum absolute value of the 2-D peri- 
odogram [37]. Moreover, the CRLB for the estimation of a single tap multipath channel can be 
written as [24] 

"■wi^i^S' ™ftw*& wjhr) £?- <26) 
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Note that Kay and Nekovei [25] proposed a low complexity weighted phase averager estimator 
as an alternative to solving (24). 

If we were to use (24) when multiple taps are present (P > 1), then 

tf(v) (Y X*) 4>(t) = KLY^ a p A x (v p - v ,t-t p ) + [Z X*] </>(r), 

v 

which is the superimposition of complex scaled, delay and frequency shifted ambiguity functions, 
plus an additive Gaussian noise term. We see that detection and estimation of a particular tap 
will be significantly affected by the main lobe and sidelobes from the ambiguity functions of the 
remaining taps. This motivates a successive cancellation approach whereby the signal contribution 
in Y induced by a multipath tap is removed after it is detected, thus allowing subsequent taps 
to be detected and estimated. Successive cancellation algorithms have found widespread use in 
a number of communication scenarios requiring the recovery of multiple superimposed signals. 
In particular, interference cancellation (successive and parallel forms) is the basis of practical 
low-complexity multi-user decoding algorithms, which attain close to single-user bit error rate 
performance [38]. In our case, the superimposed signals are not signals from multiple users, 
but time/frequency shifted versions of the same signal. However the same principle can still be 
applied, and as we will see later, achieves performance close to the CRLB of a single tap channel 
(provided the taps are sufficiently separated in either delay or Doppler). In this direction, the 
first algorithm we propose is based on successive cancellation and is employed to find an initial 
estimate of the delay, Doppler and complex gain of each tap. The second algorithm we propose 
is based on parallel cancellation and is employed to refine the initial estimates. Integral to both 
of these algorithms is a search for the largest absolute value of a 2-D periodogram [37], and 
we propose a low-complexity 2-D bisection algorithm for doing this. A detailed description of 
each of these algorithms is given as follows. 

A. Initial Estimation 

Algorithm 1 describes our proposed initial successive cancellation procedure. First we initialise 
the residual error matrix equal to the received noisy OFDM symbols Y. At iteration 
p — 1,2, ... ,P: we find r p and v, p that correspond to the maximum absolute value squared of 
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the 2-D periodigram of E^; construct the p x p matrix 



R 



i P 



yR P i 



Rr, 



Rr, 



hp2 ■ ■ ■ J^pp j 

and length p vector = (wi,w 2 , ■ ■ ■ , w p ) substituting the delay and Doppler estimates = 
(fi, ... , t p ) and = (z>!, . . . , z> p ) into (22) and (23); re-estimate the length p complex gain 
vector = (cii . . . , a p ) = (R^^w^; and finally subtract the signal contributions of all p 
estimated multipath components from Y, which becomes the residual error matrix for the next 
iteration. 

Typically Algorithm 1 will estimate the multipath starting from the strongest to the weakest 
tap, i.e. | Si| > 1 62 1 > ••• > \a P \. Thus, for the case when P is unknown, an obvious exit 
criterion is to stop once \a p \ < 7, where 7 is a threshold that determines the minimum tap 
energy. Alternatively, the Algorithm can be modified to incorporate a model order selection 
rule [29]. 

Note that two simple modifications can be made to Algorithm 1 to further reduce complexity. 
Firstly, in the main loop, rather than subtracting all multipath contributions of the previously 
estimated components from the original signal Y to obtain the residual error E^, simply subtract 
the contribution of the current estimate from the residual error of the previous iteration E^ 1 ^, 



a p p) if)(i> p )<fi\f p ) 



X. Secondly, rather 



i.e. line 6 can be replaced with E^ = E^ 1 ^ 
than operating on Y, one could apply the algorithm on the zero-forcing estimate of H, i.e. 
Hi,k — Yi,kXtk/\Xi t k\- Thus, in Algorithm 1, one simply replaces Y with H and the Hadamard 
product with X in lines 3 and 6 is no longer required. To summarize, we can make the following 
complexity-reducing modifications to Algorithm 1. Line 1: E^ = H, Line 3: (f p , v p ) = 



arg max T v 



^\v)E {p) ct)(T) , and Line 6: E {p+l) = H - *(j> (p) )diag(a (p) )$ t 



B. Estimation Refinement 

It is quite reasonable to rely solely on Algorithm 1 to estimate the delay/Doppler. Indeed 
similar approaches have been employed in [14, 15,28], but without any detailed comparison to 
theoretical bounds. We find that the performance of Algorithm 1 is hampered by interference from 
undetected taps, which as we will see later, introduces a floor in the root mean squared (RMS) 
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error performance. Therefore we propose a refinement process based on parallel cancellation 
whereby for each iteration, all multipath components are removed except for the component of 
interest, that is subsequently re-estimated. This refinement procedure is described in detail in 
Algorithm 2, where f {i) = (fj , . . . , rf), = (z>f } , . . . , vf) and a (l) = (a?, af) denote 
the refined estimates after the i'th iteration, and f ^ = f, z>^ = v and = a are the initial 
estimates obtained from Algorithm 1. In addition, we let fp\ and a!p denote the refined 
estimates at step % with element p element omitted. 

Note that rather than refining for a fixed number of iterations, Algorithm 2 can be easily be 
modified to incorporated an early stopping criterion, e.g. by checking the improvement in the 
residual error ||-E|| 2 . As previously described for Algorithm 1, one could apply Algorithm 2 to 
the zero-forcing estimate of H, i.e. replace Y with H and removing the Hadamard product 
with X in lines 4 and 5. 

C. 2-D Bisection Algorithm 

As mentioned earlier, the maximisation step in line 3 Algorithm 1 and line 5 of Algorithm 2 
can be solved by finding the maximum absolute value of the 2-D periodogram [37]. To perform 
this operation, we propose a 2-D bisection approach described as follows. First we assume 
t p £ (jmm, Tmax) and v v e (zVin, ^max) for all p = 1, . . . , P, i.e. the delay /Doppler of each tap is 
constrained to lie within predefined intervals. Let (r^ n ,Tmax) and (^ m , ^mL) denote the search 
interval at iteration i, and f ^ and denote linearly spaced vectors within these intervals, i.e. 

€ ] = r^L + (m- l)Ar« z>« = u± + (n - l)Ai/» (27) 

for m = 1, . . . , M and n = 1, . . . , N, where Ar« = (r£L - r^J/M and Au^ = (i&L - 
^mL) IN denote the bin spacing at the i'th iteration. For each iteration of the bisection algorithm, 
we find the indices corresponding to the largest peak of ^(v^) [Y X*} <3?(f ^). For the 
next iteration, the search interval is then bisected or reduced to a smaller 2-D region, i.e. 
(2/3 At^, 2/3Az/W), centered at the previous delay/Doppler indices (typically (3 > 1/2). A 
detailed description of the procedure is given in Algorithm 3. Note that for ease of exposition, the 
bisection process completes after a fixed number of iterations iV biscct . The algorithm can easily 
be modified to employ an early stopping criterion, e.g. exit the main loop when Ar^ < e T and 
Az/W < e v to ensure a certain level of delay/Doppler resolution. 
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V. Performance Evaluation 

Performance evaluation is complicated by the fact there are infinitely many possible multipath 
channel realisations and many OFDM system design configurations all of which can have a 
significant effect on the estimator's performance. To reduce our analysis, we focus on OFDM 
systems with similar specifications to the IEEE802.11p standard (as described in Section III). 
In addition, we concentrate on multipath channels typical of outdoor mobile vehicular environ- 
ments [39], i.e. delay spreads not exceeding 200 nsec and Doppler differentials not exceeding 
1000 Hz. For example, at a carrier frequency of 5.9 GHz, this corresponds to a maximum excess 
delay of 60 m and velocity differentials of 51 m/s or 183 km/hr. 

Ultimately, we would like to investigate the estimator's performance for as many different 
multipath channel configurations as possible. However, we find that the performance is signifi- 
cantly affected by the location of the multipath taps in the 2-D delay/Doppler space. When two 
or more taps are too close to each other there is a high probability Algorithm 1 will detect these 
as a single tap. 1 The minimum separation distance is essentially the delay/Doppler resolution of 
the estimator, which is dependent on the main lobe of the ambiguity function, which in turn, is 
dependent on the subcarrier spacing and duration of the OFDM packet (as evidenced in (17)). 
When the components are sufficiently separated, the estimator's performance is dominated by 
AWGN and hence the CRLB (26). 

To separate the above mentioned effects, we conducted Monte Carlo simulations whereby for 
each trial a random set of multipath taps is generated. Whilst these taps are drawn randomly, they 
are not i.i.d., and instead are drawn to ensure a minimum separation in delay and Doppler. This 
is achieved by continually drawing a vector of P delays from an i.i.d. uniform distribution on 
the interval (r min , r max ) until the minimum pairwise distance between the delays is greater than a 
specified At. The delays are then sorted in ascending order. The Doppler offsets are generated in 
a similar fashion on the interval (z/ min , ^ max ), but with no sorting. Note that At < (r max — r min )/P 
and similarly Av < (z/ max — Vmm)/P- Whilst we fix the power delay profile, for each trial, the 
phase of each tap is generated randomly according to a uniform distribution over the interval 
(0, 27r). Once the multipath taps are generated, the frequency domain channel coefficients are 

'in a physical sense, if these closely spaced taps are the result of first order reflections it may imply they are reflections from 
the same object. 
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generated using (8) and the received noisy symbols are generated using (7), where, without loss 
of generality, we assume X^ k = 1. It is important to note how the error statistics were calculated. 
For each trial, RMS error statistics were only collected when all taps are detected, i.e. each tap is 
closest (in Euclidean distance) to a single estimate. Events when this does not occur are counted 
as missed detections, but are not included in the RMS error statistics. This allows us to separate 
error events caused by miss detections due to the transmit ambiguity function. 

In our simulations we considered a P = 3 tap multipath channel, with power delay profile 
Kl 2 = 0, |a 2 | 2 = -10 and |a 3 | 2 = -20 dB, (r min ,r max ) = (0,200) nsec, (iw, iw) = 
(—500,500), minimum delay separation of At = 66.67 nsec and minimum Doppler separation 
of Au = 333.33 Hz. Error statistics were collected from 10 4 trials. 

Fig. 2 shows the miss detection probability for L = 128, 256 and 512 OFDM packet lengths. 
We see that when L = 128, the miss detection probability is greater than 10 percent. As L 
increases the main lobe of the ambiguity function shrinks in the Doppler domain improving the 
resolution of the estimator and hence reduces the miss detection probability. When L = 512, no 
miss detections were recorded for an SNR greater than 5 dB. 

Fig. 3 shows the RMS estimation error results (recalling that this is restricted to instances 
where missed detection does not occur). The square marked curves show the RMS error when 
no refinement is performed, i.e. only Algorithm 1 is employed. In this case a floor in the RMS 
error performance is observed (caused by undetected multipath components in the successive 
cancellation process). When refinement is employed, as shown by the circle marked curves, the 
error floor is significantly reduced. Moreover, as L increases the floor does not occur until very 
high SNRs and the RMS error performance is primarily dominated by the CRLB (26), which 
is shown by the dashed curves. Thus with sufficiently long packet length, Algorthms 1 and 2 
deliver single-tap performance, i.e. are able to accurately cancel the contributions of "interfering" 
taps. 

It is interesting to translate the estimator performance into range/velocity resolution. Consid- 
ering L = 512 and SNR 20 dB, the 3 -standard-deviation values for the —20 dB tap are 9 ns 
and 15 Hz. This corresponds to range resolution of 2.7 m and relative velocity resolution (at 
5.9GHz) of 0.77 m/s (2.7 km/h). This clearly demonstrates the capability to accurately resolve 
quite challenging multipath channels. 
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VI. Conclusion 

In this paper we examined amplitude, delay and Doppler estimation of the multipath channel 
taps from OFDM signal transmission in a doubly selective mobile environment. Under certain 
practical system design and mobile channel assumptions, we showed that the frequency domain 
channel coefficients for an entire OFDM packet can be written as the superimposition of 2-D 
complex sinusoids. The angular frequency of each sinusoid is proportional to the delay and 
Doppler of a particular multipath tap. 

ML estimation of the delay/Doppler requires non-linear least squares minimisation, which is 
computationally infeasible for practical implementation. We therefore proposed a low complexity 
suboptimal estimation method, based on successive cancellation, whereby multipath components 
are removed once they are detected. The complexity reduction results from a simplification 
of the channel model, where time variations manifest only as Doppler frequency offsets for 
each tap. For a single tap channel, this method is maximum likelihood. The performance of 
this successive cancellation approach can be degraded by interference from taps that are yet to 
be detected in future iterations. To remedy this, we proposed a refinement algorithm based on 
parallel cancellation, i.e. all estimated multipath components are subtracted except the component 
of interest, which is subsequently re-estimated. 

The performance of our estimator was shown to be dominated by two effects: separation of the 
multipath taps in the delay/Doppler plane; and noise. When two or more taps are close together 
in the 2-D delay/Doppler space, the estimator may detect these as single tap, resulting in missed 
detections and significantly degrading the RMS error of other detected taps. When the multipath 
taps are sufficiently separated in delay/Doppler the estimator performance is dominated by noise 
and hence the RMS error of the refined estimates are very close the CRLB of a single 2-D 
sinusoid in additive white Guassian noise. We believe the missed detections are caused by the 
transmit ambiguity function: broadness of the main lobe affects the delay/Doppler resolution; 
and sidelobes of components that have not been sufficiently subtracted can mask weaker taps. 
However, a detailed analytic investigation of these effects is beyond the scope of this paper and 
the subject of future work. 

Note that although our results assume delay spreads less than the cyclic prefix, our proposed 
estimator still works well without this restriction. Multipath taps with delay exceeding the cyclic 
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prefix will introduce inter-symbol interference. The estimator views this interference as extra 
noise on the received symbols. Thus, as long as AWGN dominates, this extra interference will 
have negligible effect on performance. 

Appendix I 

Derivation of Receiver Matched Filter Output 
For clarity we repeat the transmitted signal, 

x(t) = = -i= " (!' - l)T d )e^ k '^ K ^- T ^ T . (28) 

v v KL l>,k> 

Application of the channel response (4) to the transmitted signal (28) yields 

/oo 
x(t - T)h(t,r) dr 
-oo 

= -t= a P Xl '> k ' / 9(r - r p )w(t - r - (/' - ^Tje-WjMV-i-WWt-T-T^/T dr + ^ 

= -7= E a^we-W-i-WlWe-'H'*- ^ + -)%, „(t, r p ) + z(t), (29) 
v KL V,k>, P 

where the integral 

/oo 
g(r - rJe-'W-i-WWvit t {I 1 dr, (30) 

-oo 

is simply the convolution of a time/frequency translated filter response g(t) and time shifted 
window function w(t). Using the appropriate properties of Fourier transforms, the Fourier 
transform of (30) can be written as 

SvMt p ) = e-W^-Me-W-^G (j+^l-^j W (f), (31) 

where G(f) and W(f) denote the Fourier transforms of g(t) and w(t) respectively. In prac- 
tical OFDM systems the passband bandwidth of G(f) is typically greater than K/T and the 
bandwidth of W(f) is typically less than l/T, e.g. for the simple case w(t) = then 
W(f) = V^de~- 77r ^ Td sinc(7r/T rf ). Moreover, in many OFDM standards the outer subcarriers are 
null subcarriers. Hence assuming negligible passband ripple then G (/ + ^-=^ — |^) 1 for 
k' = 1, . . . ,K and |/| < 1/(2T). Therefore, 

*M/,r p ) » e -^[^] e -W-%(/). (32) 
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Thus, taking the inverse Fourier transform yields 

J — oo 

= e~ j27TTp ^~^w{t - Tp - (I' - l)T d ). (33) 
Substituting (33) into (29) gives, 

y(t) = -jL= a p X,, fc ,e^ 2 ^ fe '" 1 -^/2Jm P /T e ^(, P -^ + #)* 
i',fc',p 

x e -i 2 M^-£) w ( t - Tp - (I' - l)T d ) + z(t), (34) 



The receiver now performs the matched filter to the transmitted sinusoids (less the cyclic 
prefix), i.e. 



i pi-id 

Y l>k = -L= / y(t)w*(t - iTje-iMk-i-WtlKt-T^/T df 

Vkl J Tcp +(i-i)T d 

i',k', P 

x w(t -t p - (I' - l)T d )w*(t -(I- l)T d )e~ j2 ^ Vp+k ^y dt + Z, 

JT m +(l-l)Tj 



^ E a p ^, fc ,e- j2 ^(^-^)e-^( fe '- fc ) T -/ T e- J ' 2 K^ + ^)('- 1 )^ 



",k', P 

k — k r 

x A w [ t p + (f - l)T d , u p + -jr- ) + Z ltk , (35) 



where 



A w (t, v)= f "w{t- T)w*(t)e- j27TUt dt, (36) 

•J T cp 

which resembles the ambiguity function of w(t). 2 In practical OFDM systems, usually max p r p < 
T cp and max p z/ p Cl/T and the windowing function is usually designed such that 
A w (t p + (/' - l)T d , Up + ~ for /c 7^ A;' or / 7^ Hence we may write, 

n* = ^ E v-#) e - J w d a-i)i w(rp? „ p) + Zlfc 

p 

= A E + Z I>fc , (37) 

p 

2 The function A w {t, v) is not quite the ambiguity function of w(t) because of the limits of integration. 
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where a p = e~ j7rKTp / T A w (t p , v p ). With some slight abuse of notation, for the remainder of the 
paper for brevity of notation (and without loss of generality) we will replace a p with a p . Defining 
Hi :k according to (8), we obtain (7). 
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Fig. 1. Magnitude contour plot of the ambiguity function (18) of an 802.11a OFDM system with PSK modulation, T = 6.4 

/Ltsec, Td = 8 /^sec 
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Algorithm 1: Initial estimation via successive cancellation. 



= Y 

for p = 1, . . . , P do 

(f p , u p ) = argmax T;i , (e® X*) 0(r) 

Contruct and using (22) and (23) with fi, . . . ,f p and £>i,...,u p . 



& (p) = ( R (p))-i w (p) 



e (p+i) 
end for 



*(j> (p) )diag(a (p) )$ t ( 



ox 



November 16, 2010 



DRAFT 



FIGURES 



23 



Algorithm 2: Estimate refinement algorithm. 



f (o) = f 5 £(o) = p and a (o) = a 

for i = 1, . . . , TV do 
for p = 1, . . . , P do 



E = Y - 



<fr (T 



-p )diag(ap 

(ff, f>«) = argrnax,,, (E X*) 0(r)| 

end for 

a« = J R- l (f(o j i>W) u ,(f« > i>(0) 

end for 



01 

2 
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Algorithm 3: 2-D Bisection Algorithm. 



Initialise (rj^ Q , t^L) = (r min , r max ) and (u^ n , = (z/ min , z/ max ) 



for i = 1, . . . , iV biscc t do 

Construct f w and £ w using (27). 
Y W = ^t(p(0) [y x*] $(f w ) 

(n,m) = argmax njm |T^L| 2 

ril = fW + /3Ar« r« n = r| } - /?Ar« 

end for 

m ' n 
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(a) L = 128 



(b) L = 128 
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(c) L = 256 



(d) L = 256 




(e) L = 512 



(f) L = 512 



Fig. 3. Three tap estimation root-mean squared (RMS) error. Dashed lines show the CRLB (26) (single-tap estimation). Solid 
lines with squares show RMS error performance of Algorithm 1 only. Solid lines with circles show the RMS error performance 
after refinement Algorithm 2 with 20 iterations. 
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